Goto

Collaborating Authors

 borel isomorphism


Borel Isomorphic Dimensionality Reduction of Data and Supervised Learning

arXiv.org Machine Learning

In this project we further investigate the idea of reducing the dimensionality of datasets using a Borel isomorphism with the purpose of subsequently applying supervised learning algorithms, as originally suggested by my supervisor V. Pestov (in 2011 Dagstuhl preprint). Any consistent learning algorithm, for example kNN, retains universal consistency after a Borel isomorphism is applied. A series of concrete examples of Borel isomorphisms that reduce the number of dimensions in a dataset is provided, based on multiplying the data by orthogonal matrices before the dimensionality reducing Borel isomorphism is applied. We test the accuracy of the resulting classifier in a lower dimensional space with various data sets. Working with a phoneme voice recognition dataset, of dimension 256 with 5 classes (phonemes), we show that a Borel isomorphic reduction to dimension 16 leads to a minimal drop in accuracy. In conclusion, we discuss further prospects of the method.


Is the k-NN classifier in high dimensions affected by the curse of dimensionality?

arXiv.org Machine Learning

Is the k-NN classifier in high dimensions affected by the curse of dimensionality? Abstract There is an increasing body of evidence suggesting that exact nearest neighbour search in high-dimensional spaces is affected by the curse of dimensionality at a fundamental level. Does it necessarily mean that the same is true for k nearest neighbours based learning algorithms such as the k-NN classifier? We analyse this question at a number of levels and show that the answer is different at each of them. As our first main observation, we show the consistency of a k approximate nearest neighbour classifier. However, the performance of the classifier in very high dimensions is provably unstable. As our second main observation, we point out that the existing model for statistical learning is oblivious of dimension of the domain and so every learning problem admits a universally consistent deterministic reduction to the one-dimensional case by means of a Borel isomorphism.